Using Quarto for everything

Lucas A. Meyer

2022-07-07

Why Quarto

The content value chain

Content stuck in my computer is nearly worthless.

Goal

  • Move good content out of my computer as fast as possible
  • Reproducible
  • Git-based collaboration
  • Write once, generate:
    • Code
    • Paper
    • PowerPoint
    • Site/documentation

Literate Programming

Donald E. Knuth proposed literate programming in a 1984 article.

One of my proudest tech moments was getting CWeave and CWeb (and LaTeX) to run on a Windows computer circa 1998.

Jupyter implements the literate programming paradigm, but I haven’t seen the Markdown part gain a lot of traction.

I tried many tools for Literate Programming

LaTeX

  • Great for PDFs… Ok, Beamer!
  • Website generation is not great
  • Dynamic content requires LaTeX programming
    • \usepackage{ifthen}
    • @for, @while

Word/PPT

  • Hard to collaborate before O365
  • Hard to reproduce / auto-generate

Jekyll

  • Great for sites
  • Not great for PPT, papers

RMarkdown

  • Great PDFs
  • Mostly good presentations
  • Hugo and blogdown work well
  • Heavily dependent on R

Python notebooks

  • Great, with Pandoc

What Quarto really is

The content pipeline for .ipynb

About 75% of data scientists use Python through Jupyter notebooks, and one can use pandoc to generate papers and PowerPoint, but it can be complicated.

graph LR
    A[.ipynb] --> B(("Pandoc"))
    B ----> E[.docx]
    B ----> H[.pptx]
    B --> C[.md]
    B --> D[.tex] 
    D --> F((Xetex))
    C --> I((Hugo))
    F --> G[.pdf]
    I --> J[.html]
    style B fill:#FF6655AA
    style F fill:#88ffFF
    style I fill:#88ffFF
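
The manual route in the diagram above can be sketched as a list of commands. This is only an illustration under assumptions: the notebook name `analysis.ipynb` is hypothetical, the function only builds the commands and does not run them, and the Hugo branch is left out because it depends on how the site is set up.

```python
from pathlib import Path


def pandoc_commands(notebook: str) -> list[list[str]]:
    """Build the commands for the manual pipeline; nothing is executed here."""
    stem = Path(notebook).stem
    return [
        # Direct conversions: Pandoc picks the writer from the output extension.
        ["pandoc", notebook, "-o", f"{stem}.docx"],
        ["pandoc", notebook, "-o", f"{stem}.pptx"],
        # Two-step PDF route: a standalone LaTeX file first, then XeTeX.
        ["pandoc", notebook, "--standalone", "-o", f"{stem}.tex"],
        ["xelatex", f"{stem}.tex"],
    ]


if __name__ == "__main__":
    # Hypothetical notebook name, used only to print the commands.
    for cmd in pandoc_commands("analysis.ipynb"):
        print(" ".join(cmd))
```

Each output format needs its own invocation, and the PDF needs two, which is the complication that the next slides address.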

Quarto for Python, in a nutshell

In Quarto’s .qmd files, you write Markdown and code, just like .ipynb. Add some YAML configuration and Quarto does the intermediate steps. It integrates well with VSCode and Jupyter.

graph LR
Q[.qmd] --> A
subgraph Quarto
    A[.ipynb] --> B(("Pandoc"))
    B --> C[.md]
    B --> D[.tex] 
    D --> F((Xetex))
    C --> I((Hugo))
    style B fill:#FF6655AA
    style F fill:#88ffFF
    style I fill:#88ffFF
end
    B ----> E[.docx]
    B ----> H[.pptx]
    F --> G[.pdf]
    I --> J[.html]

But wait, there’s more!

Quarto can easily run pre-scripts and post-scripts. I frequently use this to pre-process data and to automatically publish output to git repositories.

graph LR
    P[Pre-scripts] --> Q
    style P fill:#AA99FF
    Q[.qmd] --> A
    subgraph Quarto
        A[.ipynb] --> B(("Pandoc"))
        B --> C[.md]
        B --> D[.tex] 
        D --> F((Xetex))
        C --> I((Hugo))
        style B fill:#FF6655AA
        style F fill:#88ffFF
        style I fill:#88ffFF
    end
        B ----> E[.docx]
        B ----> H[.pptx]
        F --> G[.pdf]
        I --> J[.html]
    E --> X[Post-scripts]
    H --> X
    G --> X
    J --> X
    style X fill:#AA99FF
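
A post-script could look roughly like the sketch below. It assumes a Quarto project that lists the script under `post-render` in `_quarto.yml`, and it relies on the `QUARTO_PROJECT_OUTPUT_FILES` environment variable that, to my understanding, Quarto passes to project scripts as a newline-separated list. The function names and the commit message are hypothetical, and the git commands are only built, not run.

```python
import os


def rendered_files(env=None) -> list[str]:
    """Return the output files Quarto reports to a post-render script."""
    env = os.environ if env is None else env
    # Assumption: the variable holds one output path per line.
    raw = env.get("QUARTO_PROJECT_OUTPUT_FILES", "")
    return [line.strip() for line in raw.splitlines() if line.strip()]


def git_publish_commands(files, message="Publish rendered output"):
    """Build (not run) the git commands that would publish the rendered files."""
    if not files:
        return []
    return [
        ["git", "add", *files],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


if __name__ == "__main__":
    for cmd in git_publish_commands(rendered_files()):
        print(" ".join(cmd))
```

Keeping the command construction separate from execution makes the script easy to dry-run before it touches a repository.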

Using Quarto

The YAML front-matter

Whether you use Quarto from .qmd, .ipynb, or .Rmd files, you always start with a YAML front-matter block.

The YAML configuration determines the output format of your document. A few popular output options are html, pptx, docx, and pdf.

You can use a single source file to generate multiple output types.

For example, the YAML below will generate a PowerPoint file and a Revealjs presentation.

---
title: "Using Quarto for everything"
format: 
    pptx:
        reference-doc: templates/template.pptx
    revealjs:
        incremental: false
        theme: pulse

author: Lucas A. Meyer
date: 2022-07-07
---
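
With front matter like the above, one source file can be rendered to each format from a script. The sketch below assumes the `quarto` CLI is on the PATH and uses its `render` command with the `--to` flag; the file name `talk.qmd` is hypothetical.

```python
import subprocess


def quarto_render_commands(source: str, formats: list[str]) -> list[list[str]]:
    """Build one `quarto render` command per requested output format."""
    return [["quarto", "render", source, "--to", fmt] for fmt in formats]


def render_all(source: str, formats: list[str]) -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in quarto_render_commands(source, formats):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical source file; prints the commands instead of running them.
    for cmd in quarto_render_commands("talk.qmd", ["pptx", "revealjs"]):
        print(" ".join(cmd))
```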

Main content

Writing the main content

Most writing in Quarto is done in Markdown.

Quarto’s Markdown supports everything I’m used to: figures, tables, bibliography, etc.

It also supports lots of extra features, like diagrams with mermaid and GraphViz and even LaTeX equations:

\[ E = mc^2 \]

### Writing the main content

Most writing in Quarto is done in [Markdown].

Quarto's Markdown supports everything I'm
used to: figures, tables, bibliography, etc.

It also supports lots of extra features, like
diagrams with `mermaid` and `GraphViz` and
even LaTeX equations: 

$$
E = mc^2
$$

What if I want to add code?

The best thing about Quarto is that you can use it to run any code that you would be able to run in a Python notebook.


import numpy as np
import matplotlib.pyplot as plt

r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.plot(theta, r)
ax.set_rticks([0.5, 1, 1.5, 2])
ax.grid(True)
plt.show()

Diagrams

You can use mermaid to create diagrams.

Here’s the first example from Mermaid’s website. The diagrams in previous sections were created with mermaid.

flowchart LR

A[Hard] -->|Text| B(Round)
B --> C{Decision}
C -->|One| D[Result 1]
C -->|Two| E[Result 2]

Regression and results

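# Imports this snippet relies on. The `md` helper is not defined on the
# slide; markdownify's HTML-to-Markdown converter is assumed here.
import pandas as pd
import statsmodels.formula.api as smf
from IPython.display import Markdown, display
from markdownify import markdownify as md

# Load the data
df_wage = pd.read_csv("data/wage1.csv")

# Create an OLS model using the R-style formula syntax (an intercept is added by default)
mod = smf.ols(formula="wage ~ educ", data=df_wage)

# Fit the model
res = mod.fit()

# Show the coefficient table as Markdown
display(Markdown(md(res.summary().tables[1].as_html())))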

              coef   std err        t    P>|t|   [0.025   0.975]
Intercept  -0.9049     0.685   -1.321    0.187   -2.250    0.441
educ        0.5414     0.053   10.167    0.000    0.437    0.646

Presentations in Quarto

Basic slide syntax

To create slides, you create sections with #, titles with ##, and bullets with -.

Content types

  • You can add several types of content
    • code (use backticks)
    • images
    • diagrams
    • tables
    • etc.

To create slides, you create sections 
with `#`, titles with `##`, and bullets 
with `-`.

Quarto will render your content in slide form.

### Content types

- You can add several types of content
    - code (use backticks)
    - images
    - diagrams
    - tables
    - etc.
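
Because the slide syntax is plain Markdown, a deck can also be assembled from data. The sketch below is a minimal illustration of the `#`, `##`, and `-` conventions described above; the function name `build_deck` and the nested-dictionary input shape are assumptions made for the example.

```python
def build_deck(sections: dict[str, dict[str, list[str]]]) -> str:
    """Turn {section: {slide title: [bullets]}} into Quarto slide Markdown."""
    lines: list[str] = []
    for section, slides in sections.items():
        lines += [f"# {section}", ""]          # "#" starts a section
        for title, bullets in slides.items():
            lines += [f"## {title}", ""]       # "##" starts a slide
            lines += [f"- {bullet}" for bullet in bullets]
            lines.append("")                   # blank line closes the list
    return "\n".join(lines)


if __name__ == "__main__":
    deck = build_deck(
        {"Presentations in Quarto": {"Content types": ["code", "images", "tables"]}}
    )
    print(deck)
```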

Creating PowerPoint slides

To generate a presentation from a .qmd file, add format: pptx to the YAML front-matter.

The part I liked the least is that Quarto will use the pandoc PowerPoint rules to render the content from the .qmd into the .pptx.

The “pandoc rules” substantially limit the flexibility you would have in PowerPoint presentations. Quarto has better presentation support for revealjs and beamer.

  • Quarto can use a template with (only) these layouts:
    • Title Slide
    • Title and Content
    • Section Header
    • Two Content
    • Comparison
    • Content with Caption
    • Blank

PowerPoint layout rules

The rules are available at:
https://pandoc.org/MANUAL.html#powerpoint-layout-choice

  • Title Slide: created from metadata fields like title and author
  • Section Header: created from the top-level markdown headings (for example, #)
  • Two Content: used when the .md source contains a .columns div (:::: {.columns}) with at least two .column divs
  • Comparison: used instead of “Two Content” when at least one column contains text followed by non-text content (e.g. an image or a table)
  • Blank: used for slides that have no displayable content (e.g. notes)
  • Content with Caption: used when content doesn’t have a columns div but has text followed by non-text content
  • Title and Content: whatever doesn’t fit the rules above.
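
The decision order can be sketched as a small function. This is a simplified reading of the rules in the Pandoc manual linked above, not pandoc's actual implementation: the boolean inputs are hypothetical summaries of a slide, and "text and non-text" stands in for pandoc's stricter "text followed by non-text" test. The Title Slide is omitted because it comes from metadata rather than slide content.

```python
def choose_layout(
    has_columns: bool,
    has_text: bool,
    has_non_text: bool,
    is_section_heading: bool = False,
) -> str:
    """Pick a PowerPoint layout name from a coarse description of a slide."""
    if is_section_heading:
        return "Section Header"
    if not has_text and not has_non_text:
        return "Blank"
    if has_columns:
        # Mixed text and non-text content in columns selects "Comparison".
        return "Comparison" if (has_text and has_non_text) else "Two Content"
    if has_text and has_non_text:
        return "Content with Caption"
    # Fallback for everything that matched no other rule.
    return "Title and Content"
```

The practical consequence is that the layout is inferred from the content, so changing what is on a slide can silently change which layout the template applies.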

PowerPoint templates

By adding a reference-doc entry to your YAML, you can tell Quarto (and pandoc) to use a file as a template for the format of your presentation.

The “Slide Master” needs to contain layouts named as per the previous slide (e.g. “Comparison”).

This gives you a lot of flexibility in the design of your slide deck, even though it applies only to the small number of layouts listed in the previous slide.

You can control fonts, add background images, page numbering, etc.

---
title: "Using Quarto for everything"
format: pptx
reference-doc: templates/template.pptx
author: Lucas A. Meyer
date: 2022-07-14
---
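
Before pointing reference-doc at a template, it can help to check that the template actually contains the expected layout names. The sketch below is one way to do that with the standard library only, assuming the usual .pptx package structure (layouts stored as `ppt/slideLayouts/*.xml`, with the name on the `cSld` element). The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

PML_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
REQUIRED_LAYOUTS = {
    "Title Slide",
    "Title and Content",
    "Section Header",
    "Two Content",
    "Comparison",
    "Content with Caption",
    "Blank",
}


def layout_names(pptx_file) -> set[str]:
    """Collect the layout names stored in a .pptx file (path or file object)."""
    names = set()
    with zipfile.ZipFile(pptx_file) as archive:
        for member in archive.namelist():
            if member.startswith("ppt/slideLayouts/") and member.endswith(".xml"):
                root = ET.fromstring(archive.read(member))
                c_sld = root.find(f"{{{PML_NS}}}cSld")
                if c_sld is not None and c_sld.get("name"):
                    names.add(c_sld.get("name"))
    return names


def missing_layouts(pptx_file) -> set[str]:
    """Return the required layout names that the template does not define."""
    return REQUIRED_LAYOUTS - layout_names(pptx_file)
```

Calling `missing_layouts("templates/template.pptx")` would then return an empty set for a template that defines all seven layouts.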

Best feature: generate content dynamically

Let’s say you’re presenting a project about population dynamics but you don’t know which world leaders are coming to the conference.

On the presentation day, you learn that Belgium, China, Brazil, India, Japan and Nigeria are attending.

You can use Python or R to automatically generate slides.

Generating slides with Python

The next slides/sections were generated using the code below:

# Imports this snippet relies on
import pandas as pd
from IPython.display import Markdown, display

# plot_dependency_ratio and plot_population_pyramid_series are plotting
# helpers that are not shown on this slide.

df_dr = pd.read_csv("data/dr.csv.gz", compression="gzip")
df_pop = pd.read_csv("data/pop_brackets.csv.gz", compression="gzip")

years = [2000, 2025, 2050, 2075, 2100]
regions = ["Belgium", "China", "Brazil", "India", "Japan", "Nigeria"]

for name in regions:
    # One slide per region: a title, then two columns with one plot each
    display(Markdown(f"## Age and Population Pyramids for {name}"))
    display(Markdown('<div class="columns">'))
    display(Markdown('<div class="column">'))
    plot_dependency_ratio(df_dr[df_dr["Location"] == name])
    display(Markdown('</div>'))
    display(Markdown('<div class="column">'))
    plot_population_pyramid_series(df_pop[df_pop["Location"] == name], years)
    display(Markdown('</div>'))
    display(Markdown('</div>'))

Age and Population Pyramids for Belgium

Age and Population Pyramids for China

Age and Population Pyramids for Brazil

Age and Population Pyramids for India

Age and Population Pyramids for Japan

Age and Population Pyramids for Nigeria

Generating a website

What I could get by just changing the format in YAML

---
title: "Using Quarto for everything"
format: html
    # revealjs:
    #     incremental: false
    #     theme: [simple, revealjs-customizations.scss]
    #     title-slide-attributes:
    #         data-background-image: images/data-viz-bg.jpg
    #         data-background-size: contain
    #         data-background-position: right

author: Lucas A. Meyer
date: 2022-07-14
---


Adding or changing the format to html will create a website.

Screenshot of website

Scholarly articles

Generating a scholarly article

I reused some of the content of this presentation to create two scholarly-looking articles. The purpose of the articles is just to show how easy it is to generate them with Quarto; they don’t contain original research.

The relevant files are:

Scholarly article screenshots

Citations and Footnotes

Citations don’t work on presentations, but are easy to add to documents.

You need a BibTeX file, e.g., references.bib, and a reference to it in the YAML front-matter: bibliography: references.bib. Quarto supports any Citation Style Language (CSL) style.

You can cite by using [@citation-name] in your text. Please check the article .qmd source and the PDF and DOCX outputs.

Generating footnotes is also easy. Using [^ref] links to a footnote, and [^ref]: content of the footnote generates its content.
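
A quick check that every cited key exists in the BibTeX file can catch broken citations before rendering. The sketch below uses simple regular expressions that cover common cases only, not the full Pandoc citation or BibTeX grammars; the function names and the keys in the usage example are hypothetical.

```python
import re


def cited_keys(text: str) -> set[str]:
    """Collect keys cited as [@key] or [@key1; @key2] in Markdown text."""
    keys = set()
    # Look only inside square brackets that contain an "@".
    for group in re.findall(r"\[([^\[\]]*@[^\[\]]*)\]", text):
        keys.update(re.findall(r"@([A-Za-z0-9_][\w:-]*)", group))
    return keys


def bib_keys(bibtex: str) -> set[str]:
    """Collect entry keys such as the one in '@article{key, ...}'."""
    return set(re.findall(r"@\w+\s*\{\s*([^,\s]+)\s*,", bibtex))


def undefined_citations(text: str, bibtex: str) -> set[str]:
    """Return cited keys that have no matching BibTeX entry."""
    return cited_keys(text) - bib_keys(bibtex)


if __name__ == "__main__":
    text = "Literate programming [@knuth1984; see also @meyer2022] has a footnote[^ref]."
    bib = "@article{knuth1984,\n  title = {Literate Programming}\n}"
    print(undefined_citations(text, bib))
```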

Books in Quarto

Books in Quarto

Here are a few books that have been recently written with Quarto:

Should I use Quarto?

Where I think Quarto is good (July 2022)

Articles: maybe yes

I think Quarto is more helpful for a team that already uses Git with Python notebooks or LaTeX to write articles. Microsoft Word collaboration through SharePoint and Teams is easier than Git and Quarto… but it’s not reproducible.

Python notebook: excellent

Quarto adds to Python notebooks without detracting anything. All you need are a few YAML lines.

Blog: excellent

Quarto allows me to have a scriptable, Python-based blog. I can automate my blog to tweet and post to LinkedIn when I write new articles.

Presentations: maybe not

Only if you have

  • a lot of dynamic content
  • reproducibility needs
  • collaborators used to Git/Beamer

THANK YOU